Predicting Hospital Exit

Catherine and Gi’s project

Data description

Catherine Ishitani, Gi Kim (Wharton)
March 15, 2021

In this post, we wrangle our hospital data a bit and then describe it, focusing on:

  1. characteristics of exiting hospitals and

  2. elements of our data that may warrant additional consideration in standard ML models.

Data

We’ve added another five years to our data: we now observe the population of U.S. hospitals (~6,300) from 2003 to 2017.

Our predictor variables are drawn from five categories:

Wrangling

Our data is already tidy, but we spend some time cleaning up the financial variables. HCRIS (Healthcare Cost Report Information System) data are extracted from unaudited financial records submitted to the Centers for Medicare & Medicaid Services (CMS). They’re the most complete source of provider financial data, but they are noisy. Prior literature typically winsorizes outliers and unreasonable values. We also create a net debt variable and several financial growth variables.

library(dplyr)

#flag values outside the yearly 1st and 99th percentiles (and missing values)
var_vec <- c('ptnt_mgn', 'ni_mgn', 'liquid', 'uncomp', 'uncomp_mgn', 'capex', 'capex_mgn', 'rev_adm', 'tot_assets', 'ptnt_opex', 'oth_costs', 'tot_costs','levg')
clean_p1_p99 <- cleaned %>% group_by(year) %>% summarise_at(var_vec, .funs=list(p1=~quantile(.,.01,na.rm=TRUE),p99=~quantile(.,.99,na.rm=TRUE)))
#each row of cleaned matches exactly one year, so the join keeps the row order of cleaned
clean_p1_p99_merged <- inner_join(cleaned,clean_p1_p99,by='year')

#accumulate the flag across variables; resetting it inside the loop would keep only the last variable's flag
cleaned$rmflag <- FALSE
for(varname in var_vec) {
  cleaned$rmflag <- cleaned$rmflag |
    is.na(cleaned[[varname]]) |
    cleaned[[varname]] < clean_p1_p99_merged[[paste0(varname, '_p1')]] |
    cleaned[[varname]] > clean_p1_p99_merged[[paste0(varname, '_p99')]]
}

#drop flagged rows (this trims rather than winsorizes: values are removed, not capped)
cleaned <- cleaned %>% filter(!rmflag) %>% select(-rmflag)
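The step above filters out flagged rows. Winsorizing in the stricter sense caps extreme values at the percentile bounds and keeps every row. Here is a minimal sketch of that alternative on made-up data; the helper `winsorize()` and the column `ni_mgn_w` are hypothetical names for illustration, not part of our pipeline.

```r
library(dplyr)

# Toy panel: the column names mirror the post, but the values are simulated
set.seed(1)
toy <- tibble(
  year   = rep(2003:2004, each = 200),
  ni_mgn = c(rnorm(200), rnorm(200, sd = 5))
)

# Cap a vector at its lower and upper percentiles (NAs stay NA)
winsorize <- function(x, lower = 0.01, upper = 0.99) {
  bounds <- quantile(x, c(lower, upper), na.rm = TRUE)
  pmin(pmax(x, bounds[[1]]), bounds[[2]])
}

# Winsorize within each year, keeping every row
toy_w <- toy %>%
  group_by(year) %>%
  mutate(ni_mgn_w = winsorize(ni_mgn)) %>%
  ungroup()
```

Capping keeps the hospital-year in the panel, which may matter if extreme financial values tend to show up shortly before an exit.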

#create a net debt variable
cleaned <- mutate(cleaned, netdebt = debt - cash)

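#create growth variables (sort by year within hospital first, since lag() follows row order)
#note: the change is scaled by the current-year value, not the prior-year value
cleaned <- cleaned %>% arrange(num_prvdr_num, year) %>% group_by(num_prvdr_num) %>% mutate(across(c(rev_tot,rev_netptnt,admtot,net_income,ptnt_income,exptot), list(ch=~(.x-lag(.x))/.x))) %>% ungroup()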
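The growth code above divides the year-over-year change by the current-year value. A more conventional growth rate divides by the prior-year value instead. The sketch below shows that variant on a toy panel; the column `rev_tot_g` is a hypothetical name, and the numbers are made up.

```r
library(dplyr)

# Toy panel, deliberately out of order to show why sorting matters
toy <- tibble(
  num_prvdr_num = c(1, 1, 1, 2, 2),
  year          = c(2005, 2003, 2004, 2003, 2004),
  rev_tot       = c(121, 100, 110, 50, 40)
)

toy_g <- toy %>%
  arrange(num_prvdr_num, year) %>%   # lag() follows row order, so sort first
  group_by(num_prvdr_num) %>%
  mutate(rev_tot_g = (rev_tot - lag(rev_tot)) / lag(rev_tot)) %>%
  ungroup()
```

Each hospital's first year has no prior value, so its growth rate is `NA`. Variables that can be zero or negative in the prior year (such as net income) will still produce unstable or undefined ratios under either definition.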

Description

Our key outcome variable is a hospital’s market participation decision in each year. Mapping hospital exits (by closure, acquisition, or conversion) suggests that they occur in both rural and urban areas and are not concentrated within any one region.

#here's how we made the above gif 

#plot exiting hospitals & rural counties
p <- plot_usmap(data = rural, values = "code", color = "lightskyblue1", size = .001) + 
  geom_point(data = closures, aes(x = long.1, y = lat.1), color="gray0", shape = 16, size=1) +
  scale_fill_continuous(low = "deepskyblue3", high = "white", name = "Rural", label = scales::comma) + 
  #labs(title = "U.S. hospital closures, 2003-2017") +
  theme(legend.position = "right") 

anim <- p + transition_states(year, transition_length = 0, state_length = 10) +
  enter_fade() +
  exit_fade() +
  ggtitle("U.S. hospital closures, 2003-2017",subtitle='{closest_state}')
animate(anim, duration = 20, fps=5, renderer = magick_renderer())
anim_save("closure_map.gif", animation = last_animation())

How have hospital outcomes changed over time?

cleaned %>%
  tabyl(year, outcome_ex) %>%
  adorn_totals(c("row", "col")) %>%
  adorn_percentages("row") %>% 
  adorn_pct_formatting(digits = 0) %>% 
  adorn_title("combined") %>%
  knitr::kable() %>%
  footnote(general = "")

The rate of hospital exits slowed after 2005, from ~4% to ~1% of hospitals exiting each year. Closure is the most common form of exit; conversion and being acquired and then closed (“absorbed”) are rarer still. About 4% of hospitals are bought each year throughout the period. Somewhat surprisingly, neither the Great Recession nor the Affordable Care Act appears to have affected hospitals’ average investment, closure, and entry patterns.

How do hospitals that exit differ from ones that don’t?

Exiting hospitals are located in markets with higher rates of poverty and uninsurance, and they are more often in states that did not expand Medicaid. They are smaller and provide less complex care; they are more likely to be for-profit or public and less likely to be system-owned. Exiting hospitals are about equally likely to be located in rural and urban areas; however, very few small, rural (“Critical Access”) hospitals close. On average, exiting hospitals are within 15 miles of a 100-bed hospital, compared to 17 miles for non-exiting hospitals. They have lower occupancy rates, profit margins, debt and cash levels, and capital expenditures.

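Average characteristics by exit status (pre_out_ex: 0 = hospitals that do not exit, 1 = hospitals that exit; n = number of observations):

pre_out_ex 0 1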
num_prvdr_num 264499 274205
tot_pop 699935 765226
white 0.780 0.754
highschool 0.311 0.312
college 0.159 0.147
unempl 0.0737 0.0758
med_inc 55048 52373
uninsur 0.129 0.148
public_insur 0.329 0.329
private_insur 0.662 0.637
elderly 0.146 0.141
poverty 0.151 0.159
male 0.495 0.494
own 2.01 2.17
mcaid_exp 0.137 0.060
wage_index 0.986 0.960
tacmi 1.47 1.23
bought 0.0320 0.0449
sysowned 0.588 0.542
bought_is 0.0153 0.0233
bdtot 161 93
admtot 6573 2681
ipdtot 39229 21088
paytot 52527291 19050288
exptot 1.32e+08 4.60e+07
fte 891 364
teach 0.0548 0.0112
catholic 0.1099 0.0881
cah 0.234 0.052
minorteach 0.231 0.139
rural 0.363 0.367
mcare 0.480 0.487
mcaid 0.164 0.149
vi 0.452 0.322
tot_services 45.8 27.7
tech_services 0.169 0.136
mh_services 0.0669 0.0613
bought_ss 0.0167 0.0216
hsa_sh 0.575 0.451
hosp_hsa 5.74 7.36
hrr_sh 0.0567 0.0220
hosp_hrr 34.7 37.1
dist2hosp 17.3 15.4
ch_tot_serv 0.03038 0.00647
occ 0.573 0.519
age_hosp 27 23
age_sys 7.94 6.55
exit 0 0
enter 0.0202 0.0400
switch 0.0110 0.0184
switch2np 0.00765 0.01041
invest -0.0134 0.0753
cont 0.98 0.96
lat 38.0 36.8
long -92.5 -91.9
uncomp 22.7 25.5
dsh 0.489 0.500
dsh_pct 0.152 0.143
dsh_adj 3.72 1.53
ptnt_opex 157.6 51.9
oth_costs 5.17 2.76
tot_costs 159.5 52.8
rev_tot 170 52
rev_netptnt 156.6 47.8
net_income 7.22 -1.49
ptnt_income -2.26 -3.91
cash 18.65 2.82
debt 24.67 7.61
tot_assets 191.3 46.9
fa_tot 77.2 19.8
capex 10.29 3.23
liquid 28.14 2.72
rev_adm 0.0410 0.0471
levg -10.60 -7.11
ptnt_mgn -364 -146
ni_mgn -364 -146
uncomp_mgn 0.104 0.175
capex_mgn 33.3317 0.0825
outcome_ex 0 0
has_exit 0 1
netdebt 6.02 4.79
rev_tot_ch 0.0128 -0.0818
rev_netptnt_ch 0.00953 -0.09110
admtot_ch -0.0816 -1.4111
net_income_ch 62.8 -29.4
ptnt_income_ch NaN -0.0655
exptot_ch -0.00655 -0.05082
n 77142 1249
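A comparison table of this shape can be built by averaging every variable within each exit group and then flipping the result so that variables run down the rows. The sketch below uses a made-up three-column data frame and assumes `tidyr` is available; it is one way to get this layout, not necessarily the code behind the table above.

```r
library(dplyr)
library(tidyr)

# Toy data: column names follow the table, values are made up
toy <- tibble(
  pre_out_ex = c(0, 0, 0, 1, 1),
  bdtot      = c(150, 170, 160, 90, 100),
  occ        = c(0.60, 0.55, 0.56, 0.50, 0.54)
)

# Group means plus a row count per group
means <- toy %>%
  group_by(pre_out_ex) %>%
  summarise(across(everything(), ~ mean(.x, na.rm = TRUE)), n = n())

# Flip so that each variable is a row and each group is a column
comparison <- means %>%
  pivot_longer(-pre_out_ex, names_to = "variable") %>%
  pivot_wider(names_from = pre_out_ex, values_from = value)
```

Averaging identifier columns such as `num_prvdr_num` is not meaningful, so in practice those rows would be dropped before reporting.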

Note that in our data, some non-profit hospitals do not exit even after many years of negative profits.

Data considerations

Panel data: We observe the same hospitals repeatedly over time, so observations are not independent. We’re looking into ways to accommodate panel data in ML models, e.g., generalized linear mixed-model (GLMM) trees.
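One implementation of GLMM trees is the `glmertree` package, which we have not yet applied to our data. The sketch below only illustrates the three-part formula (node model | random effects | splitting variables) using the demo data bundled with that package, which is unrelated to hospitals. The hospital formula in the comment is a hypothetical analog, not a fitted model.

```r
library(glmertree)

# Demo data shipped with glmertree (patients clustered in treatment centers)
data("DepressionDemo", package = "glmertree")

# response ~ node-level model | cluster (random intercept) | partitioning variables
glmm_tree <- glmertree(
  depression_bin ~ treatment | cluster | age + anxiety + duration,
  data   = DepressionDemo,
  family = binomial
)

# Inspect the fitted tree structure
print(glmm_tree$tree)

# Hypothetical hospital analog (not run; variable choice is an assumption):
# has_exit ~ 1 | num_prvdr_num | occ + ni_mgn + bdtot
```

The random intercept absorbs cluster-level differences (here, treatment centers; for us, hospitals), while the tree splits on the remaining predictors.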

Variable collinearity: Many of our financial variables are highly correlated (blue in the corrgram below), and most variables are autocorrelated. We hope to lean on CART (and other ML algorithms) for variable selection, since collinearity should be less problematic for tree-based methods than it is for linear models.
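A quick way to see which variable pairs drive the collinearity is to list the pairs whose absolute correlation exceeds a cutoff. The sketch below uses simulated data in base R; the 0.9 cutoff and the simulated relationship between `tot_costs` and `ptnt_opex` are assumptions for illustration.

```r
# Simulated data: ptnt_opex is built to be nearly collinear with tot_costs
set.seed(42)
n <- 500
tot_costs <- rnorm(n, mean = 100, sd = 20)
toy <- data.frame(
  tot_costs = tot_costs,
  ptnt_opex = tot_costs * 0.95 + rnorm(n, sd = 2),
  occ       = runif(n)
)

# Pairwise correlations, tolerating missing values
cors <- cor(toy, use = "pairwise.complete.obs")

# Keep each pair once (upper triangle) and list those with |r| > 0.9
idx <- which(abs(cors) > 0.9 & upper.tri(cors), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cors)[idx[, "row"]],
  var2 = colnames(cors)[idx[, "col"]],
  r    = cors[idx]
)
```

In this toy example, only the `tot_costs` and `ptnt_opex` pair is flagged.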

Missing years of data: In our data, missing observations often occur during financial distress or just before a hospital exits a market, so missingness itself may be informative about exit.
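One way to quantify this is to count, for each hospital, the years that fall inside its observed span but have no record. The sketch below does this on a toy panel; the columns `years_observed`, `years_spanned`, and `missing_years` are hypothetical names, and the data are made up.

```r
library(dplyr)

# Toy panel: hospital 2 has no record for 2004
toy <- tibble(
  num_prvdr_num = c(1, 1, 1, 2, 2),
  year          = c(2003, 2004, 2005, 2003, 2005),
  has_exit      = c(0, 0, 0, 1, 1)
)

# Compare the number of years observed with the length of the observed span
gaps <- toy %>%
  group_by(num_prvdr_num, has_exit) %>%
  summarise(
    years_observed = n_distinct(year),
    years_spanned  = max(year) - min(year) + 1,
    .groups = "drop"
  ) %>%
  mutate(missing_years = years_spanned - years_observed)
```

This only catches gaps between a hospital's first and last observed years. Years missing at the very end of a hospital's history, which may be the most relevant ones before an exit, would need to be measured against the exit year instead.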